Plagiarism takes many forms, ranging from copying text to adopting ideas without giving credit to their originators. This paper presents a new taxonomy of plagiarism that highlights the differences between literal plagiarism and intelligent plagiarism from the plagiarist's behavioral point of view. The taxonomy supports a deep understanding of the different linguistic patterns used in committing plagiarism, for example, rewriting text into a semantically equivalent form with different words and organization, shortening text through concept generalization and specification, and adopting the ideas and important contributions of others. Different textual features that characterize different plagiarism types are discussed. Systematic frameworks and methods of monolingual, extrinsic, intrinsic, and cross-lingual plagiarism detection are surveyed and correlated with the plagiarism types listed in the taxonomy. We conduct an extensive study of state-of-the-art techniques for plagiarism detection, including character n-gram-based (CNG), vector-based (VEC), syntax-based (SYN), semantic-based (SEM), fuzzy-based (FUZZY), structural-based (STRUC), stylometric-based (STYLE), and cross-lingual (CROSS) techniques. Our study corroborates that existing systems for plagiarism detection focus on copied text but fail to detect intelligent plagiarism, in which ideas are presented in different words.
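To illustrate the contrast between literal and intelligent plagiarism, the following is a minimal sketch of a character n-gram comparison. It is an assumption-laden illustration, not the method of any surveyed system: the function names (`char_ngrams`, `jaccard`), the choice of trigrams, the use of Jaccard similarity, and the example sentences are all hypothetical.

```python
def char_ngrams(text, n=3):
    """Return the set of character n-grams of a lowercased, whitespace-normalized text."""
    normalized = " ".join(text.lower().split())
    return {normalized[i:i + n] for i in range(len(normalized) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; defined here as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical example sentences (not taken from any corpus).
source = "Plagiarism detection systems compare a suspicious document with source documents."
literal_copy = "Plagiarism detection systems compare a suspicious document with source documents."
paraphrase = "Tools that spot copied work match a questionable text against original writings."

literal_score = jaccard(char_ngrams(source), char_ngrams(literal_copy))
paraphrase_score = jaccard(char_ngrams(source), char_ngrams(paraphrase))

print(f"literal copy: {literal_score:.2f}")
print(f"paraphrase:   {paraphrase_score:.2f}")
```

Under these assumptions, the verbatim copy shares every trigram with the source, while the paraphrase shares very few even though it conveys the same idea, which is the kind of case a surface-level comparison would miss.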